PREFACE

RAND  is  helping  to  design  an  Enlisted  Force  Management  System 
(EFMS)  for  the  Air  Force.1  The  EFMS  is  a  decision  support  system 
designed  to  assist  managers  of  the  enlisted  force  in  setting  and  meeting 
force  targets.  The  system  contains  computer  models  that  project  the 
force  resulting  from  given  management  actions,  so  actions  that  meet 
targets  can  be  found.  Some  of  those  models  analyze  separate  job 
specialties  (disaggregate  models)  and  others  analyze  the  total  enlisted 
force  across  all  specialties  (aggregate  models);  some  models  make  annual 
projections  (middle-term  models)  and  others  make  monthly  projections. 

The  Short-Term  Aggregate  Inventory  Projection  Model  (SAM)  is  the 
component  of  the  EFMS  that  makes  monthly  projections  (for  the  rest  of 
the  current  fiscal  year)  of  the  aggregate  enlisted  force.  The  overall 
SAM  model  contains  five  modules: 

Module  P:  Preprocessor. 

Module  1:  Separation  Projection. 

Module  2:  Inventory  and  Cost  Projection. 

Module  3:  Computer  Aided  Design. 

Module  4:  Plan  Comparison. 

SAM is documented in C. Peter Rydell and Kevin L. Lawson, Short-term
Aggregate Model for Projecting Air Force Enlisted Personnel (SAM), RAND,
N-3166-AF, 1991. That Note gives detailed specifications
for  modules  P  and  2  through  4.  Module  1  (the  Separation  Projection 
module)  projects  monthly  loss  and  reenlistment  behavior.  The  detailed 
specifications  for  alternative  versions  of  Module  1  are  presented  in 
separate  publications.  These  describe  three  promising  methods  of 
predicting  the  separations  required  from  Module  1: 

• Time series forecasting.

• Robust separation projection.

• Benchmark separation projection.

1 For an overview of the EFMS, see Grace Carter, Jan Chaiken, Michael
Murray, and Warren Walker, Conceptual Design of an Enlisted Force
Management System for the Air Force, RAND, N-2005-AF, August 1983.

All  three  methods  predict  the  monthly  losses  and  reenlistment  flows 
that  are  needed  as  inputs  to  Module  2.  They  predict  "policy-free" 
flows--the  losses  and  reenlistments  that  would  occur  in  the  absence  of 
early  release  and  early  reenlistment  programs.  (Module  2  accounts  for 
the  effect  of  past  and  present  management  actions  on  losses  and 
reenlistments.) Although they share the same objectives, the three
methods differ fundamentally in the way they accomplish those
objectives.

The  time  series  forecasting  method  uses  models  such  as  constant 
rate,  regression,  autoregressive,  and  straight  line  running  average. 
These  models  are  documented  in  Marygail  K.  Brauner,  Kevin  L.  Lawson, 
William  T.  Mickelson,  Joseph  Adams,  and  Jan  M.  Chaiken,  Time  Series 
Models for Predicting Monthly Losses of Air Force Enlisted Personnel,
RAND,  N-3167-AF,  1991. 

The  robust  separation  projection  method  uses  data  on  past  losses 
and  reenlistments  to  estimate  separation  rates  for  a  model  that  predicts 
loss  and  reenlistment  flows  one  month  at  a  time  for  each  of  a  mutually 
exclusive  set  of  about  500  cohorts.  After  these  flows  are  predicted  for 
a  projection  month,  the  inventory  is  updated  and  the  models  are  applied 
to  the  updated  inventories  to  predict  the  flows  for  the  following  month. 
This  process  is  repeated  until  the  inventory  for  the  last  month  of  the 
fiscal  year  is  projected.  Thus,  it  applies  separation  rates  to  a  series 
of  different  inventories.  The  robust  method  is  specified  in  this  Note. 

The  benchmark  separation  projection  (BSP)  method  uses  data  on  past 
losses  and  reenlistments  to  estimate  a  set  of  separation  rates  for  each 
month  of  the  fiscal  year  for  a  mutually  exclusive  set  of  about  280 
"decision  groups."  Those  separation  rates  are  then  applied  to  the 
current  inventory  to  predict  monthly  loss  and  reenlistment  flows  for  the 
rest  of  the  fiscal  year.  Thus,  the  BSP  method  applies  different  sets  of 
separation  rates  to  a  single  inventory  (that  single  inventory  is  the 
inventory at the start of the projection period). The BSP method is
documented in C. Peter Rydell and Kevin L. Lawson, The Benchmark
Separation Projection Method for Predicting Monthly Losses of Air Force
Enlisted Personnel, RAND, N-3168-AF, 1991.
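
The difference between the robust and BSP approaches can be illustrated with a minimal sketch. It is not taken from any of the cited Notes: the function names are hypothetical, a single group stands in for the roughly 500 cohorts or 280 decision groups, and reenlistment flows are ignored.

```python
def project_chained(start_inventory, rates):
    """Robust-style chaining: each month's rate is applied to the inventory
    left after the previous month's flow has been removed."""
    inventory = start_inventory
    flows = []
    for rate in rates:
        flow = inventory * rate
        flows.append(flow)
        inventory -= flow  # update the inventory before the next month
    return flows


def project_benchmark(start_inventory, monthly_rates):
    """BSP-style: every month's rate is applied to the same starting inventory."""
    return [start_inventory * rate for rate in monthly_rates]


if __name__ == "__main__":
    print(project_chained(1000.0, [0.10, 0.10]))    # rates relative to the updated inventory
    print(project_benchmark(1000.0, [0.10, 0.09]))  # rates relative to the starting inventory
```

In this toy case the two calls describe the same losses, which suggests why the BSP method needs a separate set of rates for each month: its rates are expressed relative to the starting inventory rather than to the inventory remaining in that month.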

The  names  "robust"  and  "benchmark"  are  historical  artifacts. 
"Robust"  refers  to  a  particular  method  of  averaging  past  separation 
rates  that  is  not  unduly  influenced  by  outliers  in  the  historical  data. 
"Benchmark"  refers  to  the  method's  original  purpose:  to  serve  as  a 
standard  of  comparison  for  the  accuracy,  reliability,  and  runtime  of 
alternative  methods  for  Module  1.  The  benchmark  model  became  an 
attractive  alternative  in  its  own  right. 

This  Note  documents  RAND's  research  that  led  to  the  mathematical 
specification  for  the  robust  method.  It  should  be  of  interest  to  the 
Air  Force  members  of  the  EFMP  who  are  building  the  EFMS.  It  should  also 
be  of  interest  to  modelers  and  analysts  who  are  involved  in  manpower  and 
personnel  research  for  the  uniformed  services.  This  specification  was 
presented  to  the  Air  Force  as  one  possible  solution  to  the  problem  of 
predicting  the  short-term  behavior  of  airmen.  The  Air  Force  is  using 
this  and  other  specifications  as  the  point  of  departure  for  developing  a 
method  for  predicting  the  monthly  losses  of  enlisted  personnel  in  Module 
1  of  SAM.  As  a  consequence,  the  version  of  Module  1  that  will  be  used 
in  the  EFMS  is  likely  to  differ  considerably  from  that  presented  in  this 
Note. 


The  work  described  here  is  part  of  the  Enlisted  Force  Management 
Project  (EFMP),  a  joint  effort  of  the  Air  Force  (through  the  Deputy 
Chief  of  Staff  for  Personnel)  and  RAND.  RAND's  work  falls  within  the 
Resource  Management  Program  of  Project  AIR  FORCE.  The  EFMP  is  part  of  a 
larger  body  of  work  in  that  program  concerned  with  the  effective 
utilization  of  human  resources  in  the  Air  Force. 
SUMMARY


The  Short-Term  Aggregate  Inventory  Projection  Model  (SAM)  is  one 
component  of  the  Enlisted  Force  Management  System  (EFMS).  SAM  makes 
monthly  projections  (for  the  rest  of  the  current  fiscal  year)  of  the 
aggregate  force  (the  total  enlisted  force  across  all  specialties).  SAM 
can  be  used  to  analyze  the  total  size,  grade  composition,  and  budget 
cost  of  the  enlisted  force  during  a  fiscal  year.  It  supports  planning 
of  management  actions  to  achieve  user-specified  end-of-year  force  levels 
(known  as  "end  strengths")  and  user-specified  end-of-year  grade  levels 
(known as "grade strengths").

The  SAM  model  contains  five  modules: 


Module P: Preprocessor.

Module 1: Separation Projection.

Module 2: Inventory and Cost Projection.

Module 3: Computer Aided Design.

Module 4: Plan Comparison.


Module  1  (the  Separation  Projection  module)  predicts  "policy-free" 
monthly  losses  and  reenlistments  of  Air  Force  enlisted  personnel  for  the 
rest  of  the  current  fiscal  year.  "Policy-free"  means  that  the 
predictions  assume  zero  early  releases  and  zero  early  reenlistments 
caused  by  actions  of  enlisted  force  managers.  The  robust  separation 
projection  method  is  one  way  of  predicting  the  separations  required  from 
Module 1.

The  predictions  are  inputs  to  Module  2  of  SAM,  which  adds  the 
effects  of  early  release  and  early  reenlistment  programs  (and  other 
management  actions)  to  convert  the  predictions  of  policy-free  losses  and 
reenlistments  into  predictions  of  actual  losses  and  reenlistments.  The 
robust  separation  projection  method  uses  data  on  past  losses  and 
reenlistments  to  estimate  separation  rates  for  a  model  that  predicts 
policy-free  loss  and  reenlistment  flows  one  month  at  a  time  for  each  of 
a  mutually  exclusive  set  of  about  500  cohorts.  After  these  flows  are 
predicted  for  a  projection  month,  the  inventory  is  updated  and  the 
models  are  applied  to  the  updated  inventories  to  predict  the  flows  for 
the  following  month.  This  process  is  repeated  until  the  inventory  for 
the last month of the fiscal year is projected. Thus, it applies
separation rates to a series of different inventories.
ACRONYMS AND ABBREVIATIONS


AFSC      Air Force Specialty Code
ARIMA     Auto-Regressive Integrated Moving Average (type of
          time-series model)
CAT       Category of enlistment (first-term, second-term,
          career-term, retirement eligible)
CATENLST  Category of enlistment (same as CAT)
DOEYRMO   Date of current enlistment--year, month
DOSYRMO   Date of separation--year, month
EFMS      Enlisted Force Management System
ETS       Expiration of term-of-service
ETSYRMO   Expiration of term-of-service--year, month
FY        Fiscal year
GRADE     Pay grade
INV       Inventory at beginning of month
IPM       Inventory Projection Model
LATR      Attrition loss indicator
LETS      ETS loss indicator
METS      Months to end of term of service
MIT       Month in term
MOS       Month of service
PDGL      Promotion/Demotion Gain Loss (file)
REUP      Reenlistment indicator
SABL      Seasonal Adjustment Bell Labs
SAM       Short-term Aggregate Inventory Projection Model
SAM1      Module in SAM that estimates policy-free separations and
          performs policy-free inventory projections
SPD       Separation Program Designator
SPDTRCD   General category of transaction (loss, reenlistment, etc.)
SSAN      Social Security Number
TAFMSD    Date of total active federal military service--year,
          month, day
TAFMSDYM  Date of total active federal military service--year, month
TOE       Term of enlistment (number of years (4 or 6) of enlisted
          obligation)
TERMENLT  Term of enlistment (same as TOE)
UAR       Uniform Airman Record (file)
USAF      United States Air Force
XLEN      Extension status (yes or no, short or long)
YOS       Years of service
YRMO      Date of the file--year, month
I. INTRODUCTION


The  Short-term  Aggregate  Inventory  Projection  Model  (SAM)  is  the 
component  of  the  Air  Force  Enlisted  Force  Management  System  (EFMS)  that 
provides  one-  to  twelve-month  projections  for  the  aggregate  force 
(across all specialties). It will be used to analyze the size, grade
composition, and cost of the enlisted force during a fiscal year and to
support the planning of management actions designed to achieve
fiscal-year goals for total force strength, force strength by the top
five grades, and personnel costs.

SAM  consists  of  five  modules: 

• SAMP--data preparation preprocessor.

• SAM1--separation and inventory projection.

• SAM2--inventory and cost projection.

• SAM3--computer-aided design of management actions.

• SAM4--plan comparison.

This Note describes Module 1 of SAM (SAM1). Rydell and Lawson
(1991a) provide an overview of SAM and detailed descriptions of the
other four modules.

PURPOSE OF SAM1

SAM1 forecasts flows of enlisted airmen. For each month, it
estimates how many airmen reenlist, are lost, or simply continue in
their terms. It divides losses into two types: attrition (not
fulfilling contractual commitments) and expiration of term-of-service
(ETS) losses (fulfilling contractual commitments).

SAM1 tracks inventories, losses, and reenlistments by grade. It
generates "baseline" forecasts of behavioral, as opposed to
policy-driven, airman decisions. If special programs are implemented to
drive airmen out of the service early, the data input to SAM1 are
adjusted to reflect loss behavior as if the policy had not been in
place, and the module works from the adjusted data.


The  Air  Force  needs  such  a  model  to  carry  out  force  planning. 
Congress  mandates  the  number  of  airmen  and  their  levels  as  of  the  end  of 
the  fiscal  year  (September  30).  Missing  those  targets  in  either 
direction  is  costly:  Budgets  may  be  overrun  or  end-strength  may  be 
insufficient  to  carry  out  the  Air  Force's  mission. 

SUPPORTING  RESEARCH 

SAM1 implements ideas that were developed at RAND over a five-year
period beginning around 1982, including several specific forecasting
models, plus the framework for chaining them together. Much of the
structure of SAM1 is the result of the knowledge gained from fitting
those models.

The initial set of forecasting models was built using the
methodology of Box and Jenkins (1970). These models use a
mutually exclusive list of about 500 airman classes and predict for each
class what fraction of airmen will be lost or will reenlist in each
future month. Thus, the models move the airman classes ahead one month
at a time. The models implicitly specify rules for who moves ahead to
where; e.g., 46 or more months into the first term, an airman is
eligible to reenlist, move ahead to month 47, in certain circumstances
fulfill his or her contractual obligations, or attrit. The functional
forms of the models vary considerably among classes. There is a diverse
mixture of autoregressive models and moving average models.

The Box-Jenkins models are quite complex and require great effort to
maintain. SAM1 should produce accurate forecasts and should be
maintainable with as little effort as possible. Alternative
forecasting models were therefore considered, with the intent of
comparing them on maintainability as well as performance.

Autoregressive models are really conditional expectation models:
Known past information is used to forecast average future information.
In the simplest case, take the average of some of the past data as the
forecast. This would smooth fluctuations in the data and yield an
estimate of future values. How much of the past data should be used to
calculate the average? Should all past data have equal weight? Data
from the distant past may not be as relevant as more recent data.
Exponential smoothing is a forecasting technique that uses continually
decreasing weights to average the data from the present into the past.
If the weights decrease very slowly, then large amounts of past data
contribute to the forecast and the exponential smoothing forecast is
almost equivalent to a simple running average. If they decrease
quickly, then the forecast is determined almost exclusively by recent
experience.
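
The effect of the rate of decay can be seen in a short sketch. It is illustrative only and is not the formulation used in any EFMS model: the function names and the smoothing parameter alpha are assumptions, and the weights are renormalized so that they sum to one over a finite history.

```python
def exponential_weights(alpha, n):
    """Weights for the n most recent observations, newest first.

    Requires 0 < alpha <= 1. Each weight is (1 - alpha) times the one
    before it; the weights are normalized to sum to 1.
    """
    raw = [alpha * (1 - alpha) ** k for k in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def smoothed_forecast(history, alpha):
    """Exponentially weighted average of history (given oldest first)."""
    weights = exponential_weights(alpha, len(history))
    newest_first = list(reversed(history))
    return sum(w * x for w, x in zip(weights, newest_first))


if __name__ == "__main__":
    history = [10.0, 10.0, 10.0, 20.0]       # the latest value has jumped
    print(smoothed_forecast(history, 0.01))  # slow decay: near the plain average of 12.5
    print(smoothed_forecast(history, 0.90))  # fast decay: near the latest value of 20
```

With a small alpha the weights are nearly equal and the forecast behaves like a running average; with a large alpha the most recent month dominates.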

The  main  problem  with  averages  is  that  they  are  greatly  influenced 
by  extreme  values.  A  very  large  past  value  of  the  data  will  increase 
the  average,  thus  increasing  the  forecast  of  the  future.  When  the  data 
fluctuate  widely,  the  median  or  middle  value  is  often  used  instead  of 
the  average  because  it  is  less  influenced  by  either  large  or  small 
outliers.  This  observation  leads  to  a  class  of  forecasting  models 
called  robust  models,  which  use  well-known  methods  of  robust  linear 
regression  and  medians  to  extract  trend  and  seasonal  effects  from  each 
series  in  ways  that  are  not  sensitive  to  outliers. 
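
A small numerical illustration of the outlier problem follows. The rates are invented for the example, and the sketch shows only why a median resists an extreme value; it does not reproduce the robust regression and seasonal extraction described later in this Note.

```python
from statistics import mean, median


def mean_forecast(past_rates):
    """Forecast the next rate as the plain average of past rates."""
    return mean(past_rates)


def median_forecast(past_rates):
    """Forecast the next rate as the middle value of past rates."""
    return median(past_rates)


if __name__ == "__main__":
    # Hypothetical monthly loss rates; the last month is an outlier.
    rates = [0.020, 0.022, 0.019, 0.021, 0.150]
    print(mean_forecast(rates))    # pulled upward by the single extreme month
    print(median_forecast(rates))  # stays with the typical months
```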

Box-Jenkins models, running average models, and robust models
provide three independent ways for SAM1 to produce its estimates. The
Air  Force  is  conducting  an  extensive  test  and  evaluation  to  determine 
which  type  of  model  it  will  use  in  the  EFMS.  Documentation  for  the 
Box-Jenkins  models  can  be  found  in  Brauner,  Lawson,  and  Mickelson 
(1991).  Running  average  models  are  the  basis  of  the  Benchmark 
Separation  Projection  model,  documented  by  Rydell  and  Lawson  (1991b). 
This  Note  documents  the  robust  models. 

OUTPUTS FROM SAM1

SAM1 projects attrition, policy-free ETS losses, retirements,
reenlistments, and flows to retirement eligibility up to 12 months into
the future. It starts with actual inventory counts in each of about 500
airman classes; then, for each month, it determines the number of each
type of transition from within each class.
The classes of airmen are defined by the following attributes:

• CAT--category of enlistment (first term, second term, career,
retirement eligible).

• TOE--term of enlistment (4 or 6 years).

• MOS--month of service (1, 2, 3, ...).

• METS--months to ETS (48, 47, ..., 0, -1, ...).

• MIT--month in term (1, 2, ...).

• XLEN--extension status (yes or no, short or long).

• YOS--years of service.

Transitions  can  be  one  of  four  types: 

•  Loss  to  attrition. 

•  Loss  to  expiration  of  term  of  service. 

•  Reenlistment. 

•  Simple  aging  into  the  next  class. 

Given these transition counts, SAM1 updates the size and
composition of the airman classes, summarizes certain features of that
month's transitions, then moves on to the next month.

Output from SAM1 becomes input to SAM2, which projects monthly
inventories and fiscal-year costs conditional upon user choices of
management actions (such as early releases) that control the shape of
the enlisted force over time.

ORGANIZATION 

Section II describes the types of databases that supported the
development and testing of SAM1, what was done with these data, and how
they guided the development of the module. Section III describes how
SAM1 works. In addition to airman counts, input to SAM1 includes a set
of loss and reenlistment models. Section IV describes the robust
models. Results from testing the robust models are discussed in Sec. V.
II. DATA FOR FITTING AND TESTING


A dataset was needed on which SAM1 could be tested and debugged.
RAND  did  not  have  the  knowledge  to  build  the  final  working  dataset,  nor 
did  it  have  the  responsibility  of  keeping  it  current  in  day-to-day 
operations.  For  these  reasons,  RAND  built  a  test  dataset  with  enough 
features  to  support  implementation,  testing,  and  development.  The  Air 
Force  has  prepared  the  dataset  for  the  operational  model. 

INFORMATION  SOURCES 

Both the test dataset and the Air Force dataset were constructed
with data from two monthly airman-level files maintained by the Air
Force: the "Uniform Airman Record" (UAR) file and the "Promotion,
Demotion, Gain, Loss" (PDGL) file. The UAR contains inventory
information  at  the  end  of  the  month,  and  the  PDGL  contains  information 
on  transactions  that  occurred  during  the  month.  With  one  record  for 
every  airman  in  the  force,  the  UAR  contains  about  500,000  records  per 
month;  the  PDGL  contains  about  30,000  records  per  month,  with  sometimes 
more  than  one  record  per  airman  per  month.  These  data  were  available  to 
us  for  the  months  from  February  1983  through  September  1987. 

Tables  1  and  2  list  the  relevant  variables  available  from  each 
source.  Each  record  contains  a  certain  amount  of  demographic 
information  (e.g.,  whether  the  airman  finished  high  school,  race,  age, 
sex),  plus  information  describing  the  airman's  status  in  the  force.  All 
of  the  variables  listed  in  the  tables  were  needed  to  classify  airmen 
into  the  modeling  categories. 

DATA  PROCESSING  REQUIREMENTS 

Unpublished  RAND  research  on  the  Enlisted  Force  Management  Project 
by  Joseph  Adams  and  Jan  Chaiken  had  identified  homogeneous  groups  of 
airmen  within  which  fairly  constant  loss  and  reenlistment  behavior  can 
be  expected.  Table  3  shows  the  variables  required  to  produce  these 
groupings,  along  with  the  variables  to  be  aggregated. 
Table 1

UAR VARIABLES USED TO CREATE DATASET FOR SAM1

Variable    Description

CATENLST    Category of enlistment codes:
              1 = first-term airman
              2 = second-term airman
              4 = career airman
              5 = E-9 or E-9 selectee with high-year of tenure waived
              blank or 9 = unknown

DOSYRMO     Date of separation--year, month
            Example: 870/

DOEYRMO     Date of current enlistment--year, month
            For first-term airmen, DOEYRMO usually = TAFMSDYM.
            For second- and career-term airmen, DOEYRMO is the
            date the current term began.

ETSYRMO     Expiration of term of service--year, month

GRADE       Pay grade

SSAN        Social Security number

TAFMSDYM    Date of Total Active Federal Military Service--
            year, month. The date the airman entered U.S.
            military service (not necessarily the Air Force).

TERMENLT    Term of enlistment
            The number of years for which an individual
            voluntarily enters into a USAF component.

YRMO        Date of the file--year, month
Table 2

PDGL VARIABLES USED TO CREATE DATASET FOR SAM1

Variable    Description

CATENLST    Category of enlistment code
              1 = first-term airman
              2 = second-term airman
              4 = career airman
              5 = E-9 or E-9 selectee with high-year of tenure waived
              blank or 9 = unknown

GRADE       Pay grade

SSAN        Social Security number

SPDTRCD     This variable identifies the general category of the
            transaction (gain, loss, reenlistment, or extension)
            and specific type of transaction within each
            category. The general groupings are

              010               non-prior service accession
              020               prior service accession
              030               gain for officer training school
              040-055           miscellaneous gain
              100-160           reenlistment
              170               extension
              200               promotion
              210               demotion
              300-310           retirement loss
              400               loss to officer training school
              410, 600-610      miscellaneous loss
              500-520, 645-655  expiration of term-of-service loss
              615-625           palace chase loss
              630-640           early release loss
              700-840           attrition loss
              other             unknown

TAFMSD      Date of Total Active Federal Military Service--
            year, month, day

TERMENLT    Term of enlistment
            The number of years for which an individual
            voluntarily enters into a USAF component.

YRMO        Date of the file--year, month


Table 3

Grouping Variables

GRADE   Pay grade--taken as the GRADE on the UAR or PDGL

CAT     Category of enlistment--computed from CATENLST on the
        UAR or PDGL
          1 = first-term airman
          2 = second-term airman
          3 = career airman
          4 = retirement eligible

TOE     Term of enlistment--taken as TERMENLT on the UAR or PDGL

MOS     Month of service--computed as the difference between
        now and the date of total active military service
        (TAFMSDYM or TAFMSD)

METS    Months to ETS--difference between now and ETSYRMO

MIT     Months in term (first term only)--computed as a
        function of TOE and METS

XLEN    Extension length (first term only)
          0 = currently on a <12 month extension
          1 = currently on a ≥12 month extension
          -99 = not currently on extension

Aggregation Variables

INV     In inventory at beginning of month--present on the
        UAR now, or present on the UAR the previous month

LATR    Attrition loss indicator--recoded from transaction
        category variable SPDTRCD (on the PDGL)

LETS    ETS loss indicator--recoded from transaction category
        variable SPDTRCD (on the PDGL)

REUP    Reenlistment indicator--recoded from transaction
        category variable SPDTRCD (on the PDGL)
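
The two date-based grouping variables can be sketched as simple month arithmetic. The sketch assumes that year-month fields are stored as four-digit YYMM integers (for example, 8707 for July 1987); the source does not state the storage format, and the function names are hypothetical. It also takes the plain difference for MOS, leaving open whether the operational definition adds one so that the first month counts as month 1.

```python
def month_index(yymm):
    """Convert an assumed YYMM integer (e.g. 8707) to a running month count."""
    year, month = divmod(yymm, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in {yymm!r}")
    return year * 12 + month


def month_of_service(yrmo, tafmsdym):
    """MOS: months elapsed from TAFMSDYM (service entry) to the file date YRMO."""
    return month_index(yrmo) - month_index(tafmsdym)


def months_to_ets(yrmo, etsyrmo):
    """METS: months from the file date YRMO to ETSYRMO; negative once ETS has passed."""
    return month_index(etsyrmo) - month_index(yrmo)


if __name__ == "__main__":
    print(month_of_service(8707, 8307))  # entered July 1983, file dated July 1987
    print(months_to_ets(8707, 8709))     # ETS two months after the file date
    print(months_to_ets(8707, 8612))     # ETS already passed
```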
To satisfy the requirements of SAM1, it was not sufficient simply
to build airman-month level variables and do a crosstabulation. First,
policy effects had to be removed from the data. During certain recent
time periods, select groups (e.g., groups approaching their expiration
of term of service) had been singled out for early-release programs at
different times. Because SAM1 makes baseline projections (projections
assuming no policy intervention), it is necessary to remove these
program effects from the dataset. Special codes in the PDGL file
indicate who left because of early-release programs; the data were
modified to treat these airmen as if they had remained in the force
until their originally scheduled ETS date. It was therefore necessary
to link an airman's records across time, then work through his or her
longitudinal history to modify the records. This added greatly to the
complexity of the data recoding algorithms. It also greatly increased
the amount of data processing: Instead of processing each monthly file
individually, the data for all months had to be sorted and merged at
the airman level.

Errors  in  the  data  posed  additional  problems.  The  UAR  and  PDGL 
files  are  known  to  have  several  unedited  fields,  which  would  require  a 
fair  amount  of  cleaning  to  correct.  The  files  are  created  to  produce 
simple  monthly  reports,  and  these  reports  (or  the  use  to  which  they  are 
put) are not sensitive to occasional errors. SAM1, however, required
cleaner  files  than  that.  Errors  in  dates  or  enlistment  categories 
caused  irreconcilable  counts  from  month  to  month.  For  example,  if 
errors  in  one  month  produced  an  overcount  that  was  corrected  by  the  next 
month,  it  was  not  possible  to  discern  why  the  counts  changed.  Was  it 
unexpected  losses  or  correction  of  errors?  The  data  contained  numerous 
stray  codes  that  required  Air  Force  personnel  expertise  to  resolve. 
RAND's  strategy  was  to  rely  on  the  fact  that  errors  in  data  items  tend 
to  be  corrected  the  following  month.  When  an  airman's  entire 
longitudinal  history  was  input,  valid  data  could  be  identified  by 
sweeping  through  all  months  and  accepting  values  that  were  consistent 
over  time. 
The data processing algorithms were developed through a long series
of  iterations.  The  first  iteration  derived  airman  characteristics  and 
reviewed  many  airmen  on  an  individual  basis.  Subsequent  iterations 
attempted  to  correct  identified  problems,  verify  their  resolution,  and 
then  produce  additional  airman  records  to  see  what  other  problems 
remained.  The  goal  was  to  achieve  internal  consistency:  UAR  and  PDGL 
records  tended  to  have  numerous  inconsistencies,  but  it  was  unlikely 
that  the  same  inconsistency  would  persist  for  a  given  airman  over  time 
(e.g.,  three  consecutive  values  of  category  of  enlistment  might  be 
(4,2,4),  in  which  case  the  2  would  be  changed  to  a  4). 
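
The (4,2,4) example suggests a simple rule, sketched below: a value that disagrees with two agreeing neighbors is treated as an error and replaced. This is an assumption made for illustration; the Note does not give the exact rules RAND applied, and the function name is hypothetical.

```python
def smooth_isolated(values):
    """Replace any value whose two neighbors agree with each other but not with it.

    Neighbors are read from the original sequence, so one correction
    never feeds into the next. The first and last values are left alone.
    """
    cleaned = list(values)
    for i in range(1, len(values) - 1):
        before, current, after = values[i - 1], values[i], values[i + 1]
        if before == after and current != before:
            cleaned[i] = before
    return cleaned


if __name__ == "__main__":
    print(smooth_isolated([4, 2, 4]))     # the stray 2 becomes a 4
    print(smooth_isolated([1, 1, 2, 2]))  # a genuine change of category is kept
```

A lasting change, such as a first-term airman becoming a second-term airman, survives because the new value persists in the following month.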

The  process  ultimately  converged,  and  a  dataset  was  built  upon 
which  many  of  the  final  modeling  decisions  were  based.  These  files  have 
been  superseded  by  files  built  by  the  Air  Force. 




III. STRUCTURE OF SAM1


SAM1 is implemented in a FORTRAN program. The program moves each
group  of  airmen  forward  one  month  at  a  time.  At  each  time  point,  some 
fraction  of  the  group  is  lost,  some  fraction  reenlists,  and  the  rest  of 
the  group  is  aged.  The  model  has  a  Markovian  flavor  in  the  sense  that, 
given  the  transition  probabilities,  the  number  of  airmen  in  a  given 
state  at  time  t+1  depends  only  on  the  inventory  at  time  t.  However,  the 
transition  probabilities  at  each  time  depend  on  more  than  just  the  most 
recent  observations,  so  the  model  is  not  strictly  Markovian. 

MODELING  ENVIRONMENT 

Several considerations guided development of SAM1. First, RAND
research had identified homogeneous groups of airmen within which fairly
constant loss and reenlistment behavior was expected. Also, SAM1's
output had to satisfy explicit requirements. Additional modules of SAM
had already been designed to display, aggregate, edit, and further
analyze SAM1's output. These modules had been designed to supply Air
Force personnel managers with the information they wanted and needed.
SAM1 was also expected to provide inputs to a Middle-Term Disaggregate
Inventory Projection Model.1 This specified a different level of
detail. Finally, the intention to validate the models on data that had
not been used in the models' development implied that the models could
change, so there was a need not to hard-wire specific models into SAM1,
but to allow change.

1 Unpublished RAND research by Joseph Cafarella, Grace Carter, Jan
Eakle-Cardinal, Robert Houchens, C. Peter Rydell, and Warren Walker.

In view of these considerations, several design decisions were made
at an early date.

• Choices of homogeneous groups were made, dependent on

  - CAT--Category of enlistment (first-term, second-term,
    career-term, retirement eligible).
  - TOE--Term of enlistment (4 or 6 years).
  - MOS--Month of service (first and retirement terms only).
  - METS--Months to ETS.
  - MIT--Month in term.
  - XLEN--Extension status.
  - YOS--Years of service.

• The time interval for projection was taken to be one month. No
limit was imposed on the number of months over which SAM1 might
forecast; that number would be an input to the program.

• The time period for model fitting (FY74-FY83) was kept separate
from the time period for testing (FY84 and beyond).

• The model had to run easily on an IBM 4381 computer (the EFMS
computer). Execution time to project 12 months could be no
more than 2 hours, and the model would have to fit within about
8 megabytes of memory.

• SAM1 had to be easily modified to permit testing different
types of models. The Box-Jenkins forecasting models contained
many parameters and would require a great deal of effort to
maintain. The plan was to test some simpler models, such as
running average models, to see how much (if any) precision was
gained by the additional complexity.

• The data examined were not stable. Plots of various series
showed abrupt shifts in loss and reenlistment rates. SAM1 had
to be designed to operate in an environment where such shifts,
whether due to policy changes or to changes in the nature of
available data, were an expected phenomenon.

• Air Force policies keep changing. For example, ETS losses
could occur anywhere within a year of ETS for the entire period
when the modeling occurred, whereas a recent decision allows
them only during the last three months of that year. SAM1 had
to be designed to produce reasonable projections in the face of
such changes.
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LOGIC  OF  SAMI 

SAMI  requires 

•  A  set  of  rules  for  mapping  grouping  variables  into  homogeneous 
groups  known  as  cohorts. 

•  A  set  of  rules  for  aging  cohorts  over  their  Air  Force  careers. 

•  Recent  counts  of  inventory,  losses  due  to  attrition,  ETS 
losses,  and  reenlistments,  by  grade. 

•  A  set  of  models  for  estimating  loss  and  reenlistment  rates. 

SAMI  takes  each  cohort  and  ages  it  one  month,  using  the  loss  rates 
and  reenlistment  rates  provided  by  the  models.  After  SAMI  cycles 
through  the  entire  set  of  cohort  indices  for  a  given  month,  the 
characteristics  of  the  cohorts  are  updated  (MOS  is  increased  by  1,  METS 
is  decreased  by  1,  reenlistments  are  sent  into  the  next  category  of 
enlistment,  etc.).  Finally,  certain  statistics  summarizing  that  month 
are  generated,  and  SAMI  moves  on  to  the  next  month. 
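
The monthly cycle described above can be sketched in Python. This is a minimal illustration under stated assumptions, not SAMI's actual code: the `Cohort` fields, the `rates` callable, the example rates, and the rule that reenlistees start a fresh 48-month term in the next category are all hypothetical, and the retirement-eligible category is ignored for brevity.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Cohort:
    cat: int      # category of enlistment (1 = first, 2 = second, 3 = career)
    mos: int      # months of service
    mets: int     # months to ETS
    count: float  # airmen in the cohort

def age_one_month(cohorts, rates, term_months=48):
    """Age every cohort one month and summarize the month's transitions.

    `rates(cohort)` is a hypothetical callable returning the fractions
    (attrition, ets_loss, reenlist) leaving the cohort this month.
    """
    aged, summary = [], {"attr": 0.0, "ets": 0.0, "reup": 0.0}
    for c in cohorts:
        attr, ets, reup = rates(c)
        summary["attr"] += c.count * attr
        summary["ets"] += c.count * ets
        summary["reup"] += c.count * reup
        stay = c.count * (1.0 - attr - ets - reup)
        # Survivors: MOS is increased by 1, METS is decreased by 1.
        aged.append(replace(c, mos=c.mos + 1, mets=c.mets - 1, count=stay))
        if reup > 0:
            # Assumed rule: reenlistees enter the next category with a new term.
            aged.append(Cohort(min(c.cat + 1, 3), c.mos + 1,
                               term_months, c.count * reup))
    return aged, summary

def example_rates(c):
    # Made-up rates: ETS losses and reenlistments only within a year of ETS.
    return (0.01, 0.0, 0.0) if c.mets > 12 else (0.01, 0.02, 0.03)

force = [Cohort(1, 10, 38, 1000.0), Cohort(1, 40, 8, 500.0)]
force, stats = age_one_month(force, example_rates)
print(stats)
```

Calling `age_one_month` repeatedly, once per projected month, mirrors the loop in which SAMI generates summary statistics and then moves on to the next month.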

Figures  1  and  2  show  the  types  of  transitions  that  airmen  can  make 
as  they  move  through  the  force.  For  simplicity,  the  figures  consider 
only  4-year  terms  of  enlistment;  nevertheless,  they  show  about  200 
states  in  the  first,  second,  and  career  terms,  and  about  150  states  for 
the  latter  part  of  the  career  term  and  the  retirement  eligible  years. 

Airmen  enter  from  the  civilian  labor  force,  and  progress  through 
their  first  term,  occupying  each  state  for  one  month.  At  any  point, 
they  can  move  forward  in  that  term,  or  they  can  reenter  the  civilian 
labor  force  through  attrition.  At  a  certain  point  in  the  term,  the 
number  of  choices  increases  by  two:  Airmen  can  reenlist,  or  they  can 
fulfill  their  contractual  obligations  and  become  ETS  losses.  If  they 
reenlist,  they  follow  a  similar  path  in  the  second  and  career  terms. 

The  complete  set  of  cohort  definitions  allowed  is  shown  in  Table  4. 
Each  combination  of  CAT,  TOE,  MOS,  METS,  MIT,  and  XLEN  is  crossed  with 
all  applicable  YOS  values.  While  about  420  combinations  of  categories 
are  indicated  in  the  table,  crossing  the  categories  with  YOS  yields 
about  1,000  combinations. 
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Table  4 

AIRMAN  COHORTS  USED  IN  SAMI 


CAT   TOE    MOS   METS    MIT   XLEN    YOS

  1     4    -99     48      1    -99      0
  1     4    -99     47      2    -99      0
...   ...    ...    ...    ...    ...    ...
  1     4    -99     13     36    -99      2
  1     4    -99     12     37    -99      3
  1     4    -99     12     37      0      3
  1     4    -99     12     37      1      3
  1     4    -99     11     38    -99      3
...   ...    ...    ...    ...    ...    ...
  1     4    -99   <-22     72    -99      5
  1     4    -99   <-22     72      0      5
  1     4    -99   <-22     72      1      5

  1     6    -99     72      1    -99      0
  1     6    -99     71      2    -99      0
...   ...    ...    ...    ...    ...    ...
  1     6    -99     13     60    -99      4
  1     6    -99     12     61    -99      5
  1     6    -99     12     61      0      5
  1     6    -99     12     61      1      5
  1     6    -99     11     62    -99      5
...   ...    ...    ...    ...    ...    ...
  1     6    -99   <-22     96    -99      7
  1     6    -99   <-22     96      0      7
  1     6    -99   <-22     96      1      7

  2     4    -99    -99    -99    -99    all
  2     4    -99     15    -99    -99    all
  2     4    -99    ...    -99    -99    all
  2     4    -99   <-11    -99    -99    all

  3     4    -99    -99    -99    -99    all
  3     4    -99     12    -99    -99    all
  3     4    -99     11    -99    -99    all
...   ...    ...    ...    ...    ...    all
  3     4    -99   <-11    -99    -99    all

  3     4    229    -99    -99    -99    all
  3     4    230    -99    -99    -99    all
  3     4    ...    -99    -99    -99    all
  3     4    239    -99    -99    -99    all
  3     4    240    -99    -99    -99    all

  3     6    -99    -99    -99    -99    all
  3     6    -99     12    -99    -99    all
  3     6    -99     11    -99    -99    all
...   ...    ...    ...    ...    ...    all
  3     6    -99   <-11    -99    -99    all

  3     6    229    -99    -99    -99    all
  3     6    230    -99    -99    -99    all
  3     6    231    -99    -99    -99    all
  3     6    ...    -99    -99    -99    all
  3     6    237    -99    -99    -99    all
  3     6    238    -99    -99    -99    all
  3     6    239    -99    -99    -99    all
  3     6    240    -99    -99    -99    all

  4   -99    241    -99    -99    -99    all
  4   -99    242    -99    -99    -99    all
...   ...    ...    ...    ...    ...    all
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NOTES:  CAT  =  3  indicates  career  term,  4  indicates  retirement  eligible. 
CAT  =  -99  indicates  category  not  used  to  define  the  cohort. 
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AIRMAN  COUNTS  AND  TRANSITION  RATES 

SAMI  needs  inventory  counts  to  know  how  many  airmen  to  project 
forward.  If,  in  addition,  the  transition  probabilities  were  known  for 
flows  between  states,  it  would  be  possible  to  predict  the  size  of  the 
force  perfectly.  It  is  these  transition  probabilities  that  have  to  be 
estimated. 

Section  II  described  how  the  airman  inventory,  loss,  and 
reenlistment  counts  were  obtained.  These  counts  are  essentially 
crosstabulations  of  airmen  by  grade  versus  the  above  combinations  of 
indices.  The  major  modification  to  the  counts  was  an  attempt  to  "put 
back"  those  airmen  who  were  lost  to  early  release  programs  or  required 
to  reenlist  early.  The  inventory  adjustments  assume  these  airmen  are  in 
the  force  until  their  contract  separation  date  and  that  the  appropriate 
ETS  loss  or  reenlistment  occurs  on  that  date.  Even  this  method  is  only 
an  approximation  to  what  would  have  occurred  had  the  early  release 
program  not  been  in  effect.  An  airman  who  was  forced  to  choose  to 
reenlist  or  leave  early  could  have  made  a  different  choice  or  attritted 
if  allowed  to  remain  in  the  Air  Force  until  his  ETS. 

Time  series  methods  were  used  to  estimate  transition  probabilities. 
The  types  of  time  series  formed  are  indicated  in  Fig.  3.  In  this  case, 
the  probabilities  are  those  relating  to  first-term  airmen  in  their  46th 
month  of  service.  Each  airman  position  was  isolated,  and  the  transition 
rates  out  of  that  position  over  the  time  period  FY74  through  FY87  were 
examined.  Figures  4  and  5  show  some  typical  time  series  so  formed. 
Figure  4  is  the  time  series  of  attrition  losses  for  first-term  airmen  in 
their  second  month  of  service.  Figure  5  is  the  time  series  of  attrition 
losses  for  first-term  airmen  in  their  third  month  of  service.  The 
former  series  seems  to  be  fairly  stable,  but  the  latter  contains  a  shift 
in  average  behavior  in  FY84.  Time  series  like  these  form  the  basis  of 
the modeling activity, as described below.

Fig. 3--Time series formed for predicting transition probability
for 1st term airmen in month of service 36
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Fig.  4--Raw  data:  Losses  due  to  attrition, 
1st  term  airmen  in  month  of  service  2 


Fig.  5--Raw  data:  Losses  due  to  attrition 
1st  term  airmen  in  month  of  service  3 
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IV.  THE  ROBUST  MODELS 


The approach uses robust methods of statistics to decompose a
series as

$$x_t = m_t + s_t + r_t$$

where

$x_t$ = the loss/reenlistment rate at time $t$,
$m_t$ = the trend,
$s_t$ = the seasonal effect,
$r_t$ = the residual component.

It  operates  by  subjecting  the  series  to  several  filters,  each  of 
which  operates  on  a  moving  window  of  points.  The  filters  are  robust  in 
the  sense  that  they  are  not  greatly  affected  by  one  or  two  outliers. 

The  robust  method  consists  of  the  following  nine  steps: 


1.  Smooth  the  data  with  12-month  moving  medians.  The  12-month 
window  is  wide  enough  to  avoid  seasonal  effects,  and  the 
medians  are  insensitive  to  outliers. 

2.  Smooth  the  moving  medians  with  moving  averages.  Because  the 
effects  of  outliers  were  eliminated  through  the  moving  medians, 
using  moving  averages  will  not  cause  a  problem  here.  These  two 
fits  have  eliminated  12  points  from  each  end;  these  are  added 
back  in  Step  8. 

3.  Compute  the  residuals  of  the  raw  data  with  respect  to  the 
moving  average  fit  from  Step  2. 

4.  Group these residuals by month of year: Regard the January
residuals as their own time series, and similarly for the other
months.

5.  Fit  medians  to  each  of  the  12  monthly  series  from  Step  4. 


6.  Calculate  final  estimates  of  monthly  effects  by  smoothing  these 
medians  using  averages  over  adjacent  months. 

7.  Subtract  these  monthly  effects  from  the  original  series;  this 
presumably  deseasonalizes  the  data. 

8.  Regress  the  deseasonalized  data  on  time  (using  robust 
regression  methods)  and  use  predicted  values  to  extend  the 
deseasonalized  series  forward  and  backward  12  months.  This 
produces  a  deseasonalized  series  over  the  same  time  frame  as 
the  original  series.  Robust  regression  methods  downweight 
outlying  values  to  guard  against  their  distorting  the  fits: 
Compare  Cleveland,  1979. 

9.  Assume for projection purposes that recent slopes in trends
will flatten out.1 Thus, project the last fitted trend point
(say, at time T) forward, and add the estimated seasonal
effects to extrapolate to the next fiscal year:

$$\hat{x}_{T+1} = m_T + s_{T-11}, \quad \ldots, \quad \hat{x}_{T+12} = m_T + s_T$$
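
The nine steps can be sketched compactly in Python using only the standard library. This is an illustrative sketch, not the EFMS implementation, and it rests on several assumptions that the text does not specify: both smoothers use a centered 13-point window (so that 12 points are lost at each end, as Step 2 states), the seasonal medians are smoothed with a circular 3-month average, a Theil-Sen line (median of pairwise slopes) stands in for the robust regression of Step 8, and only the forward end of the trend is filled in because that is all the projection needs. The function names and the `fit_months` parameter are hypothetical.

```python
from statistics import mean, median

def centered(values, half, stat):
    """Apply `stat` over a centered window of 2*half+1 points (ends dropped)."""
    return [stat(values[i - half:i + half + 1])
            for i in range(half, len(values) - half)]

def theil_sen(ts, ys):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    slopes = [(ys[j] - ys[i]) / (ts[j] - ts[i])
              for i in range(len(ts)) for j in range(i + 1, len(ts))]
    b = median(slopes)
    a = median(y - b * t for t, y in zip(ts, ys))
    return a, b

def robust_project(x, half=6, fit_months=24):
    """Return (12 seasonal effects, 12 projected values) for monthly rates x.

    x[0] is taken to be month index 0 of the year; x should span at least
    three years so every month of the year has a residual.
    """
    n = len(x)
    lost = 2 * half
    # Steps 1-2: moving medians, then moving averages of the medians.
    trend = centered(centered(x, half, median), half, mean)
    # Step 3: residuals of the raw data about the smoothed trend.
    resid = [(t, x[t] - trend[t - lost]) for t in range(lost, n - lost)]
    # Steps 4-5: median residual for each month of the year.
    by_month = [median(r for t, r in resid if t % 12 == m) for m in range(12)]
    # Step 6: smooth the 12 medians over adjacent months (circular, assumed).
    season = [mean(by_month[(m + d) % 12] for d in (-1, 0, 1))
              for m in range(12)]
    # Step 7: deseasonalize the original series.
    deseas = [x[t] - season[t % 12] for t in range(n)]
    # Step 8 (forward end only): robust line through the latest months.
    a, b = theil_sen(list(range(n - fit_months, n)), deseas[n - fit_months:])
    m_T = a + b * (n - 1)
    # Step 9: hold the trend flat at m_T and add back the seasonal effects.
    return season, [m_T + season[(n - 1 + k) % 12] for k in range(1, 13)]

# A flat five-year series with a single large outlier.
series = [0.1] * 60
series[30] = 0.9
season, projection = robust_project(series)
print([round(p, 3) for p in projection])
```

In this toy case the outlier leaves both the seasonal effects and the projection unchanged, which is the behavior the medians are meant to provide.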

The  next  section  contains  data  series  for  several  airman  classes 
with  one-year  robust  extrapolations  added  to  their  end.  It  graphically 
shows  the  effects  of  the  algorithm  and  compares  its  performance  with 
those  of  the  other  methods  using  the  test  dataset  constructed  at  RAND. 


1Indeed, if one looks at a plot of loss or reenlistment rates over
time, the series trends tend to fluctuate up and down without
predictable cycle lengths.
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V.  TEST  AND  EVALUATION  OF  THE  ROBUST  MODELS 


The  performance  of  the  models  was  examined  on  two  levels:  the  micro 
level  (Figs.  6-9),  and  the  aggregate  level  (Tables  5  and  6).  At  the 
micro-level,  the  extrapolated  probabilities  were  checked  for  reasonable 
values  by  simply  looking  at  graphs  of  projections.  At  the  aggregate 
level,  forecast  inventories  one  year  out  were  compared  with  actual 
values. 

MICRO-LEVEL  RESULTS 

The  micro-level  comparisons  focus  on  transition  rates  for  the 
approximately  500  classes  of  airmen.  Figures  6-9  display  actual  data 
(spiked  lines)  and  fitted  trend  (curves)  for  FY84-FY87  for  four  airman 
classes.  Projected  transition  rates  are  shown  in  the  last  panel  for 
FY88 using robust models (labeled R), the Box-Jenkins models fit on data
from  July  1974  through  June  1983  (labeled  B),  and  3-month  running 
average  models  (labeled  A).  These  four  particular  airman  classes  were 
chosen  because  they  represent  the  range  of  observed  patterns  and 
comparisons. 

Figures 6 and 7 show attrition losses for first-term airmen in
months  of  service  2  and  3.  The  robust  model  predicts  the  trend  and  the 
seasonality  best  of  the  three  methods.  Figure  8  shows  that  there  was  a 
large  outlier  in  mid-FY87  for  reenlistment  rates.  This  did  not  affect 
the  accuracy  of  the  robust  model  projections  but  would  have  caused  the 
running  average  model  to  forecast  reenlistment  rates  that  were  much  too 
high  toward  the  end  of  FY87.  Figure  9  demonstrates  the  inability  of  the 
Box-Jenkins  models  to  adapt  to  a  change  in  the  level  of  the  transition 
probabilities  between  the  time  period  used  for  fitting  the  models  and 
that in which the models are applied. In sum, the robust models look
fairly  reasonable  and  certainly  appear  best  among  these  three  candidates 
for  these  particular  series.  This  behavior  was  typical  of  other  series 
as  well. 
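
The outlier effect noted for Figure 8 can be illustrated with a small numerical example. The rates below are made up for illustration and are not taken from the report's data; the point is only that a 3-month running average is pulled toward a single extreme month, while a median over the same window is not.

```python
from statistics import mean, median

# Hypothetical monthly reenlistment rates; the 0.60 is a one-month outlier.
rates = [0.30, 0.31, 0.29, 0.30, 0.60, 0.30]

running_avg = mean(rates[-3:])      # 3-month running average
robust_level = median(rates[-3:])   # median of the same three months

print(round(running_avg, 3), round(robust_level, 3))
```

Here the running average lands at about 0.40 while the median stays at 0.30, the level of the non-outlying months.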
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Fig.  7--Attrition  loss  rate,  1st  term  airmen  in  month  of  service  3 


Fig.  9--ETS  loss  rate,  1st  term  airmen  in  month  of  service  49 
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AGGREGATE-LEVEL  RESULTS 

The aggregate-level results focus on total inventory by category of
enlistment. Other aggregations could be considered, such as counts of
people by grade and year of service. The decision was made to
concentrate on category of enlistment aggregations because they would be
fairly free of policy effects (recall that SAMI tries to forecast in a
"policy-free environment"). Also, published statistics of actual counts
were used for comparison. The robust model picks up several unobvious
trends  that  are  not  simply  straight-line  projections  from  the  previous 
year.  Much  of  the  force  behavior  is  predictable:  The  majority  of 
airmen  simply  age  by  one  month.  The  rates  at  which  they  are  lost  or 
reenlist  are  fairly  stable  over  time,  so  errors  in  predicting  those 
rates  do  not  have  a  major  effect  on  the  aggregate  inventory  projections. 

The  remainder  of  this  section  discusses  the  results  of  tests  of  the 
robust  models  using  a  dataset  provided  to  RAND  by  the  Air  Force  in  April 
1989. For each month in the period October 1986 through September 1988,
inventory,  losses,  and  reenlistments  were  projected  forward,  to  the  end 
of  the  fiscal  year  (FY87  or  FY88).  The  predictions  were  compared  with 
actuals.  The  appendix  contains  the  complete  set  of  actual  and  predicted 
values,  along  with  their  actual  and  percentage  differences.  This 
section  summarizes  the  full  fiscal  year  forecasts  (the  ones  that  used 
October  as  the  start  date)  and  the  half-year  forecasts  (the  ones  that 
used  April  as  the  start  date). 

The  results  of  the  test  are  not  simple  to  interpret.  Ideally, 
comparisons  of  actual  and  predicted  values  should  indicate  random 
variation.  Large  discrepancies  between  the  actual  and  predicted  values 
would  signal  possible  model  misspecification.  But  the  actual  data 
values  are  quite  sensitive  to  policy  actions  that  increase  or  decrease 
loss  and  reenlistment  rates.1  The  test  results  contain  some  of  these 
policy  effects,  and  there  is  no  simple  way  to  disentangle  them  all. 

1The policy-free adjustments affect only the timing of losses. The
net effect of the early release programs is to accelerate (and perhaps
increase or decrease) losses.
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Despite  this,  through  years  of  major  changes  in  the  inventories, 
the  model  stayed  well  within  or  close  to  1  percent  error  for  all 
categories  of  enlistment  with  one  exception,  and  that  exception  can  be 
traced  to  a  policy  effect. 

Percentage  errors  in  predicting  losses  and  reenlistments  are  much 
larger  than  for  inventories.  They  are  generally  within  10  percent.  For 
the  purposes  for  which  SAM  was  built,  producing  accurate  inventory 
projections  is  much  more  important  than  producing  accurate  predictions 
of  losses  and  reenlistments. 


Inventory  Projections 

The results of inventory projection are shown in Table 5. Under
the "actual" column, the inventory at the end of the fiscal year is
shown. Then there are two alternative predictions of that end-of-year
inventory: SAMI's prediction for that entire year (M-1) and SAMI's
prediction for the last half of the year (M-1/2) given the actual data
for the first half of the year. The percentage errors (two columns on
the right) tell the main story.


Table  5 

END-OF-FISCAL-YEAR  INVENTORY 


                                      Projected Inventory   Percentage Error
              Fiscal     Actual
CATENLST      Year       Inventory    M-1        M-1/2      M-1      M-1/2

all           1987       495640       494487     496480      -.2       .2
all           1988       481117       482205     481633       .2       .1
1st           1987       220501       221950     221545       .7       .5
1st           1988       201189       202547     200560       .7      -.3
2d            1987       118380       116748     118414     -1.4       .0
2d            1988       118613       117796     118129      -.7      -.4
career        1987       134736       133671     134416      -.8      -.2
career        1988       138692       138244     139585      -.3       .6
retirement    1987        22023        22117      22105       .4       .4
retirement    1988        22623        23617      23359      4.4      3.3
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Except  for  the  retirement  term  in  FY88,  SAMI  forecasts  have  small 
percentage  errors  across  the  board,  despite  fairly  large  changes  in  the 
inventories  from  one  year  to  the  next.  The  FY88  discrepancy  can  be 
traced  to  exceptionally  high  retirement  losses  during  the  last  two 
months  of  that  fiscal  year.  During  that  period,  early  retirement  was 
encouraged  through  waiver  of  commitments.  An  airman  could  retire  early 
in  his  current  grade  and  receive  credit  for  having  completed  his 
obligation  in  that  grade. 
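
As a worked check on the percentage errors in Table 5, the short Python example below recomputes the FY88 retirement-term entries. The formula (projected minus actual, divided by actual) and its sign convention are inferred from the table values rather than stated in the text, and the function name is hypothetical.

```python
def pct_error(projected, actual):
    """Percentage error of a projection relative to the actual count."""
    return 100.0 * (projected - actual) / actual

# Retirement-eligible inventory at the end of FY88 (values from Table 5).
actual, m1, m_half = 22623, 23617, 23359

print(round(pct_error(m1, actual), 1))      # full-year projection (M-1)
print(round(pct_error(m_half, actual), 1))  # half-year projection (M-1/2)
```

The two results, 4.4 and 3.3, agree with the table's percentage error columns for that row.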

Reenlistment  and  Loss  Projections 

Table  6  shows  how  SAMI  performed  in  estimating  counts  of  each  of 
the  three  kinds  of  transitions:  attrition  losses  (attr),  ETS  losses 
(ets),  and  reenlistments  (reup).  Cases  in  which  the  errors  are  larger 
than  10  percent  are  flagged  and  discussed  in  the  footnotes. 

To understand SAMI's predictive ability, first recall how SAMI
works.  SAMI  moves  numerous  cohorts  forward  one  month  at  a  time.  At 
each  time  point,  some  fraction  of  the  cohort  is  lost,  some  fraction 
reenlists,  and  the  rest  of  the  cohort  is  aged;  also,  new  cohorts  with 
one  month  of  service  are  "accessed."  For  a  given  position  in  the  force 
(e.g.,  1st  term,  4-year  term  of  enlistment,  37  months  of  service),  the 
transition  rates  are  based  on  3-  to  4-year  time  series  of  other  cohorts' 
experiences  while  in  that  same  position. 

SAMI's predictive ability results from three things.

•  The  observed  errors  are  conditional  on  having  the  right 
accessions  information.  SAMI  uses  this  information. 

•  Transition  rates  tend  to  be  reasonably  stable  over  time. 

•  Distance  to  ETS  explains  much  of  the  variation  in  transition 
rates,  and  SAMI  keeps  track  of  all  cohorts'  positions  relative 
to ETS. For example, when SAMI sees a large wave of
airmen approaching ETS, it has no trouble predicting a large
number of transitions.
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Table  6 

TRANSITION  COUNT  PROJECTIONS 


                      Type                    Prediction         Percentage Error
           Fiscal     of
CATENL     Year       Trans     Actual     M-1       M-1/2       M-1       M-1/2

all        1987       attr       22246     23566     21935        6.2       -1.4
           1987       ets        35414     35417     35164         .1        -.7
           1987       reup       67748     69309     68800        2.4        1.6
           1988       attr       20009     20704     21489        3.5        7.4
           1988       ets        37690     36192     35693       -3.9       -5.3
           1988       reup       71826     69871     74269       -2.8        3.4

1st        1987       attr       16940     17221     16619        1.8       -1.9
           1987       ets        20587     20683     20156         .5       -2.1
           1987       reup       25201     24834     25639       -1.3        1.7
           1988       attr       14872     15589     15792        4.7        6.2
           1988       ets        20793     20696     20051        -.5       -3.6
           1988       reup       25120     24872     26391       -1.0        5.1

2d         1987       attr        3619      4225      3508       17.4a      -3.1
           1987       ets         4849      4911      5039        1.3        3.9
           1987       reup       17506     17772     17652        1.6         .8
           1988       attr        3325      3545      3824        6.9       15.0b
           1988       ets         4825      4421      4333       -8.2      -10.2
           1988       reup       18587     18236     19217       -1.9        3.4

career     1987       attr        1629      2084      1763       28.9c       8.2
           1987       ets          808       733       864       -9.3        6.9
           1987       reup       20097     21879     20602        8.8        2.5
           1988       attr        1785      1531      1848      -13.8d       3.5
           1988       ets          898       918       876        2.6       -2.4
           1988       reup       22750     21351     23103       -6.3        1.6

retire     1987       attr          58        36        45      -37.9e     -22.4e
           1987       ets         9170      9091      9104        -.9        -.7
           1987       reup        4944      4825      4906       -2.4        -.8
           1988       attr          27        40        24       48.1e     -11.1e
           1988       ets        11174     10157     10434       -9.1       -6.6
           1988       reup        5369      5412      5559         .8        3.5

aDrop in 2d-term attrition during all of FY87.

bUpward shift in 2d-term attrition during last half of FY88.

cDownward shift in career attrition, but small base (errors
in neighborhood of 30 per month).

dUpward shift in career attrition, but small base (errors
in neighborhood of 20 per month).

eVery small bases (ACTUAL = 58 or 27).
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The  main  requirement  for  SAMI  to  do  well  is  that  there  are  no 
abrupt changes in transition rates. For example, SAMI's biggest error--
the FY88 retirement term--can be traced to exceptionally high retirement
losses  during  the  last  two  months  of  that  fiscal  year. 

CONSIDERATIONS  FOR  FURTHER  TESTING  AND  EVALUATION 

The  input  data  files  for  any  of  the  proposed  projection  models 
should  be  carefully  studied  for  anomalies  before  they  are  used  in  any 
program.  This  subsection  provides  examples  of  data  problems  encountered 
in  attempting  to  create  a  dataset  used  to  compare  the  performance  of  the 
alternative  SAMI  models. 

In  the  original  dataset,  the  number  of  airmen  increased 
dramatically  in  one  month  (by  almost  4000)  with  no  historical 
verification  of  such  an  event.  In  another  month  the  count  jumped  by 
more  than  2000,  and  then  went  down  by  another  2000  several  months  later. 
Those  jumps  are  too  large  to  be  correct. 
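
A simple screen for this kind of anomaly is to flag month-to-month changes in the inventory count that exceed a plausibility threshold. The Python sketch below is one hypothetical way to do so; the function name, the threshold of 2000, and the sample counts are illustrative assumptions, not values or procedures taken from the test dataset.

```python
def flag_jumps(counts, threshold):
    """Return (index, change) for month-to-month changes larger than threshold."""
    return [(i, counts[i] - counts[i - 1])
            for i in range(1, len(counts))
            if abs(counts[i] - counts[i - 1]) > threshold]

# Hypothetical end-of-month inventory counts with one implausible jump.
inventory = [495000, 495300, 499200, 499000, 498700]

print(flag_jumps(inventory, threshold=2000))
```

With these sample counts the screen reports one flagged month, `[(2, 3900)]`. Flagged months would then be checked against historical records before the data are used for model fitting or testing.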

In  FY87,  several  thousand  records  appeared  in  the  PDGL  files  to 
account  for  AFSC  changes.  But  the  code  that  indicated  the  type  of 
transaction  was  not  properly  initialized  in  the  program  that  generated 
the  test  dataset,  so  the  program  counted  several  thousand  more  losses 
and  reenlistments  than  actually  occurred. 

The  data  were  also  contaminated  by  policy  interventions  whose 
effects  are  hard  to  identify  and  remove.  For  example,  reenlistments 
were  affected  by  three  "reup  or  get  out"  policies,  one  in  July  1985, 
another  in  September  1986,  and  a  third  in  April  1987.  These  policies 
not  only  sent  positive  shocks  into  the  reenlistment  rates  series  but 
affected  loss  rates  as  well  (the  extension  option  is  removed,  except  for 
some  airmen  serving  overseas,  so  airmen  approaching  ETS  are  seen  to  exit 
from  the  service  at  higher  than  normal  rates).  For  example,  the  months 
immediately  following  the  April  1987  policy  had  exceptionally  high  ETS 
loss  rates.  Probably  some  airmen  who  normally  would  have  extended 
through  the  end  of  the  fiscal  year  showed  up  as  ETS  losses. 
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Once  the  data  files  have  been  checked  and  inventory  projections 
obtained,  caution  must  still  be  exercised.  Just  because  one  set  of 
plots  looks  more  reasonable  than  another  does  not  guarantee  that  the 
better-looking plots identify a better model. Abrupt shifts can occur
in  the  series  naturally,  or  the  series  may  be  contaminated  by  policy 
changes,  which  a  bad  model  can  capture  by  accident.  For  example,  if  a 
point  in  the  series  just  before  the  projection  period  happens  to  be  a 
large  positive  outlier,  and  the  actual  data  during  the  projection  period 
have  shifted  upward  as  well,  the  running-average  models  will  predict 
quite  well.  A  simple  comparison  of  actual  and  predicted  data  may  not  be 
conclusive. 

The  Air  Force  will  continue  to  perform  test  and  evaluation  on  the 
robust  and  benchmark  separation  projection  models.  Unfortunately, 
errors in prediction cannot be isolated to model misspecification only.
Policy  actions  will  continue  to  affect  the  data,  and  the  data  will 
continue  to  exhibit  certain  unexplained  shocks.  Nevertheless,  this 
exercise  will  provide  further  understanding  of  the  operating 
characteristics of SAMI and the alternative loss and reenlistment
models.


Appendix 

INVENTORIES  AND  PREDICTION  ERRORS 
THROUGH  END  OF  FISCAL  YEAR 


SAM  was  designed  to  provide  short  term  forecasts  in  a  dynamic 
environment.  It  must  be  able  to  predict  changes  in  the  force  as  the 
year  unfolds.  Air  Force  personnel  planners  need  monthly  force 
projections  at  the  beginning  of  the  fiscal  year  as  well  as  projections 
during  the  year.  The  tables  in  this  appendix  are  presented  for 
reference  purposes,  to  help  gauge  how  accurate  these  models  are  compared 
with  others  that  personnel  planners  might  be  considering.  These  tables 
show  actual  and  projected  inventories,  losses,  and  reenlistments 
beginning  in  October  for  an  entire  fiscal  year  and  beginning  in  each 
subsequent  month  for  the  remainder  of  the  fiscal  year.  The  two  fiscal 
years  that  were  used  in  this  exercise  are  1987  and  1988. 

For  predictions  of  total  inventory  after  losses,  the  percentage 
error  over  all  categories  of  enlistment  rounded  to  zero.  When 
inventories  were  predicted  for  first-term  airmen,  second-term  airmen, 
and  career  airmen,  the  error  was  2  percent  or  less.  Only  the 
predictions  for  the  inventory  in  the  retirement  term  showed  larger 
percentage  errors.  The  errors  of  4  percent,  5  percent,  and  6  percent  in 
the  August  and  September  1988  forecasts  were  the  result  of  a  retirement 
policy  change  that  could  not  be  predicted. 

The  Air  Force  is  primarily  concerned  with  predicting  accurate 
inventories.  But  accurate  inventory  prediction  results  from  correctly 
predicting  losses  and  reenlistments.  Thus,  the  prediction  of  attrition, 
ETS,  retirement  losses,  and  reenlistments  was  also  analyzed.  The 
percentage  errors  in  these  predictions  were  generally  much  larger  than 
for  the  inventory  predictions,  ranging  from  0  to  29  percent.  The  larger 
errors  result  primarily  because  small  numbers  are  more  difficult  to 
accurately  predict  than  large  numbers.  It  is  still  important  to  perform 
this  verification,  allowing  for  larger  errors  but  looking  for  extreme 
outliers and patterns that would indicate data and/or forecasting
errors.


Inventories  After  Losses:  All  Categories  of  Enlistment,  1987 
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